Comparing virtual reality and simulation to teach the assessment and management of acute surgical scenarios: A pilot study

Abstract

Background and Aims: Traditional apprenticeship-based surgical training presents challenges, especially in acute scenarios. Simulation is the current standard for facilitating surgical training in a low-risk environment but is restricted by limited accessibility and high costs. Virtual reality (VR) offers immersive three-dimensional computer-generated training scenarios and can connect users from various locations. We aimed to compare the performance of junior doctors in managing an acute surgical scenario using VR and mannequin-based simulation. We hypothesised that VR would be as effective as mannequin-based simulation in performance outcomes.

Methods: This multicentre, randomised controlled pilot study was conducted with eighteen junior doctor volunteers (Foundation and Core Trainee Year 1). Ten were randomly allocated to VR and eight to mannequin-based simulation. Participants completed questionnaires and a 15-min pneumothorax scenario. Quantitative metrics included overall score, time-to-critical decisions, and academic buoyancy scale (ABS) scores. Qualitative metrics included participants' likes and dislikes of their allocated simulation modality.

Results: VR participants scored significantly higher than mannequin-based simulation participants in overall scores (74.30% (SD ± 5.08%) vs. 59.75% (SD ± 10.14%) (p = 0.04)) and in technical skills aspects (77.20% (SD ± 8.01%) vs. 65.00% (SD ± 8.21%) (p = 0.01)). Mannequin-based simulation participants initiated critical decisions faster and demonstrated a trend towards a faster mean time-to-completion (p = 0.06). ABS scores increased for both study groups, though the increase was only significant for VR participants (p ≤ 0.01). VR participants liked how VR fostered independent learning but disliked the formulaic content and impaired communication-learning compared to mannequin-based simulation.

Conclusion: Both VR and mannequin-based simulation training are effective in training junior doctors in acute surgical scenarios but present different educational benefits. Future research should recruit a larger sample size for a full comparative randomised controlled trial.


| BACKGROUND
Surgical training is historically an arduous journey. Traditionally, surgical procedures were taught using the apprenticeship-based Halstedian model of "see one, do one, teach one". 1,2 Whilst this fosters a peer-assisted learning culture, it is criticised over concerns for patient safety, as trainees who may lack the skills practice procedures on real patients. Further factors such as time constraints and high-pressure environments make the operating theatre unsuitable as a classroom for high-risk, acute surgeries.
Up to 42% of residents feel inadequately trained to safely perform various procedures unassisted for the first time, although the true prevalence may be higher. 3,4 Halsted's model is also restricted by the 48-h work week, 5 which, whilst implemented for health and safety, makes it difficult to source additional time for surgical exposure alongside numerous clinical demands. 6 Furthermore, the COVID-19 pandemic negatively impacted surgical training opportunities, with progression within training affected by cancellations of elective procedures and redeployment of surgical trainees. 7 These negative effects were felt globally, with burnout rates amongst surgical trainees peaking as high as 95.2%. 8

In recent years, the surgical curriculum has evolved from a numbers-based mindset, which focuses on counting how many procedures a trainee has performed, to a competency-based mindset, where trainees are evaluated on what procedures they can perform independently and how well they can do so. This shifts the training approach from assuming that exposure to procedures over a fixed timeframe will be sufficient to achieve competency, to a more encompassing approach which ensures that trainees are obtaining the knowledge and skills needed to become competent. 9 Considering the different ways trainees learn helps to inform teaching strategies effectively, especially as new surgical techniques and learning modalities are continually being implemented in the surgical curriculum.
Simulation is a widely used and validated form of surgical training which provides safe and structured opportunities to practice skills outside the operating theatre. However, hiring the equipment, space and personnel needed to run simulations can be expensive. Medical school simulation examinations can cost more than £355 per student. 10 Cadaver-based simulation is especially costly, considering the limited supply and storage. 11

Virtual reality (VR) offers a significant advance for surgical simulation. VR is a computer-generated environment, where individuals can interact with a fully immersive three-dimensional virtual world through a head-mounted display. 12 Having been popularised by the gaming industry, consumer-grade headsets have become more affordable. 13 VR has been used previously by the aviation sector for flight simulation, 14 and by military sectors for replicating battlefield scenarios. 15 Its portability allows teaching to be conducted in convenient spaces, whilst also connecting multiple users across different locations, saving on time and travel. 16,17 High-fidelity VR simulators can be very realistic, and developers can tailor experiences with various complexity levels. [20]

Favourable outcomes from VR-based training have been demonstrated. Harrington et al. and Colonna et al. both demonstrated that their trauma simulators were able to distinguish decision-making abilities between trainees of different levels, with higher scores achieved by those with more training experience. 21,22 Kiyozumi et al. showed that their VR pre-hospital trauma evaluation exercises allowed completion of more training scenarios in a shorter amount of time, compared to standard face-to-face training, whilst being able to facilitate competency in initial assessment procedures. 23

Outcomes of simulation can be broadly divided into technical or nontechnical skills acquisition. Technical skills in trauma involve interventions such as chest drain insertion and pericardiocentesis, whilst nontechnical skills involve leadership and management. 24 Such skills can be examined in standardised training courses, like the ubiquitous Advanced Trauma Life Support (ATLS) course, which has reformed trauma management in more than 60 countries. 25 Examinations are commonly assessed using Objective Structured Clinical Examination (OSCE) frameworks, whereby candidates are objectively marked against standardised scoring sheets whilst completing timed activities in simulated stations. 26

During simulation, stress and self-perceived poor task execution may hinder trainee performance. Academic buoyancy may protect from the effects of psychological distress. 27,28 The academic buoyancy scale (ABS) is a validated scoring system created by Marsh et al. which reflects a learner's ability to successfully deal with short-term, minor academic setbacks, such as poor grades and exam pressures. 27 It consists of four questions on a 7-point Likert scale; learners who rate themselves with higher scores reflect higher academic buoyancy, which can translate into increased long-term resilience (Supporting Information S1: Appendix S1). Analysing simulation results in the context of ABS scores may help to better understand overall performance.
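As a hedged illustration of how an ABS score of this kind could be tallied, the sketch below sums four 7-point Likert responses into a single total (range 4 to 28). Treating the total as a simple sum is an assumption made here for illustration; the function name `abs_total` and the example responses are hypothetical and are not taken from the published scale or from the study data.

```python
def abs_total(responses):
    """Sum four 7-point Likert responses into one ABS total (assumed scoring)."""
    if len(responses) != 4:
        raise ValueError("expected exactly four item responses")
    for r in responses:
        if not 1 <= r <= 7:
            raise ValueError("each response must lie between 1 and 7")
    return sum(responses)


# Hypothetical pre- and post-scenario responses for one participant.
pre = abs_total([4, 5, 4, 4])
post = abs_total([5, 5, 5, 4])
change = post - pre  # a positive change indicates higher self-rated buoyancy
```

Under this assumed scoring, a participant's change in buoyancy is simply the difference between the post-scenario and pre-scenario totals.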

| Aim
VR is not yet a validated training modality compared to simulation, being a more recent development. This pilot study aimed to compare the performance of junior doctors in assessing and managing an acute surgical scenario using VR and mannequin-based OSCE simulation.
We analysed participants' objective performance metrics, subjective perceptions of their approach to the scenario, and any correlations between ABS and OSCE scores. Furthermore, we evaluated feasibility for a subsequent full comparative randomised controlled trial.
We hypothesised that VR would be as effective as mannequin-based simulators in the assessment and management of acute surgical scenarios. Eighteen volunteer participants were recruited and randomised to either Simulation or VR (Figure 1), where the first 10 participants to volunteer were allocated to VR and the remainder to Simulation.

| METHODS
Participants were allocated a study number to de-identify their data. All participants provided written informed consent and could withdraw their data at any time until the point of statistical analysis, after which it would be difficult to separate the de-identified data. Basic demographics including age, grade and previous exposure to surgical specialities were recorded. Ethical approval was granted by the Education Ethics Review Process board at Imperial College London.
The Oculus Meta Quest 2 (Facebook, Meta Platforms, Inc.) 29 was used for VR simulation, featuring an immersive 360° view. The display was cast onto a secondary monitor so that the participants' viewpoints could be observed simultaneously. The Oxford Medical Simulation (OMS) software application was used under a licence agreement. 30 VR participants undertook a 30-min VR tutorial to familiarise themselves with the equipment and navigation.

| RESULTS
Eighteen junior doctors were recruited: nine FY1, five FY2, and four CST (Table 1). Ten participants were allocated to VR and eight to Simulation. The mean age was 26.0 years (SD ± 2.2 years), with eleven males and seven females. Most participants had a general or vascular surgery background, were interested in acute surgical management, had undertaken previous simulation-based courses and had not been part of a trauma assessment team. Most had not used the Oculus headset, and those who had used it primarily for gaming purposes.

TABLE 1 (partial rows)
Previously been part of a trauma assessment team?
Previously worked in any specialities?
General surgery 1 2
Major trauma 1 1

| Quantitative analysis
Participants' self-rated confidence scores in the assessment and management of acute surgical scenarios improved from 4.56 out of 10.00 (SD ± 1.69) pre-session to 7.56 (SD ± 0.92) post-session (Tables 2a and 2b).
When ABS scores were compared with OSCE scores, no significant correlation was found, with an r-value of 0.02 (p = 0.94). These results are graphically represented in Supporting Information S1: Appendix S8.

Note: Rating on a scale of 1-10, with 1 being least and 10 being most. Reported as mean and standard deviation (SD).
TABLE 2b Improvement and confidence of the early assessment and management of acute surgical scenarios following the session.

| Qualitative analysis
Nine of 10 VR participants would like to have the option of being taught using VR in the future, with the remaining participant stating "in conjunction with simulation." All Simulation participants would like to have the option of being taught using simulation in future.
Participants' likes and dislikes about their allocated modality are presented in Table 6a, categorised by common themes.
Similarly, the commonly perceived advantages of each modality are presented in Table 6b, categorised by common themes.

| DISCUSSION
In summary, VR participants had a significantly higher overall OSCE score compared to Simulation participants and scored significantly higher in technical aspects. Simulation participants initiated critical decisions and completed the scenario faster, although faster completion times do not necessarily equate to having achieved competence.
VR participants also had significantly improved ABS scores, suggesting that having access to the VR teaching reinforced their academic buoyancy which, over a period of time, can translate to resilience.
Although ABS scores did not correlate with the overall marks, their greater improvement in the VR group than in Simulation supports the proposition that VR may provide a more comfortable and self-directed environment for participants to hone their clinical

Comparison of scenario completion time and time-to-critical decision.
TABLE 6a Likes and dislikes about virtual reality and Simulation.

Safe setting
• "Felt like a safe and supported setting, not stressed about making mistakes." • "Did not worry about making mistakes"

Realistic
• "Felt very realistic in terms of ordering and viewing investigations" • "Very realistic. In some elements, more realistic than sim[ulation]" • "Felt quite real" • "Very close to real life" • "Very immersive"

Interactive
• "It was very interactive and the scenarios were engaging" • "Remembering all the aspects/information available to you and allocation of duties when the options are available in the drop-down menu"

Independent learning
• "Can continue practising scenarios in which participants individually lack confidence" • "Can repeat sessions in own time if desired" • "Can go through scenarios in own time" • "Once you practice, you can really get comfortable with a range of scenarios before seeing them on the ward. I found it really beneficial to myself as a visual learner and wish there were more opportunities to learn via this method"

Resource-efficient
• "Very effective form of self-directed learning, especially if other people are not available to run sim[ulation] sessions, which can be quite resource-intensive" • "Efficient teaching method requiring less manpower"

Feedback
• "Great to get feedback straight after" • "Allows students to get immediate feedback" • "Methodical mark system at end"

Adjustment
• "Takes some time to get used to the headset and the programme. However, the gain to knowledge over time if you are given lots of exposure to this form of learning" • "A little disorientating at first" • "VR did not let me do what I wanted and I couldn't talk to the patient or do activities simultaneously" • "I also felt that some of the options were slightly unrealistic"

Technical
• "1 glitch on the programme but minimal impact to learning" • "Lack of peripheral vision" • "More difficult to locate visual prompts" • "Headset is slightly bulky and can be uncomfortable if wearing spectacles."

Communication
• "In real simulation, it is helpful having a team member in person that you can verbally communicate to and think aloud" • "Lack of communication" • "Not able to speak to the patient" • "Communication learning may be slightly impaired as the scenario phrases questions for you"

Content structure
• "Tick box" • "More formulaic, with specific options of management already laid out in front of you" • "In real life and in previous simulation training I've done, it's been less prescriptive and directed" • "It also gave me options rather than let me decide what I wanted to do spontaneously"

Practical skills
• "Does not help assess the practical skills e.g. putting in a chest drain, doing an ABG [arterial blood gas] etc."

Theme Likes about mannequin-based simulation
Reflects real-life • "Practicing acute scenarios is essential for real-life situations" • "Allows you to feel the pressure of real-life scenarios"

Natural
• "More natural, easier to make decisions and do things following the natural flow" • "More relaxed" • "More freedom to go off-piste and still get to the answer" (Continues) TRAN ET AL.
skills. Meanwhile, Simulation could be perceived as more stressful due to the in-person presence and direct observers, which could have contributed to the lower ABS scores in that group. In clinical practice, trainees are not observed to the same extent as in a simulated examination, allowing VR training to be more reflective of the working environment.
Nonetheless, stress is inevitable in acute surgical training so it would benefit trainees to learn ways to manage stress in practice scenarios.
From qualitative feedback, it is clear that VR is favoured in terms of its setting, efficiency and accessibility, and offers a well-received adjunct to learning to complement existing methods or for sole use where time, financial and geographical factors can otherwise hinder accessibility.
The options available in the drop-down menu prompt VR participants with actions to choose from, which could make the scenario easier and too formulaic. The menu also hinders multitasking: VR participants could only execute one action at a time, whilst Simulation participants could, for example, take a history and examine the patient simultaneously. This fosters their communication skills, unlike in VR, whereby the scenario phrases questions for them.

Theme Likes about mannequin-based simulation

Interactive
• "Hands on, requires communication and practicing those skills" • "Sim[ulations] are good to get confident in the practice of communication and managing under pressure" • "Interactive learning" • "Engaging" • "It is very interactive"

Theme Dislikes about mannequin-based simulation

Unrealistic
• "Sim[ulation] does not reflect a real patient, therefore harder." • "Sometimes not realistic/low fidelity because you don't actually take the bloods etc" • "Unable to physically examine a patient"

TABLE 6b Perceived advantages of virtual reality and simulation.

Safe setting
• "Can be more self-directed, can be better for students/participants who are not confident/loud in group sim[ulation] settings" • "Feels like a safe space to make mistakes and learn" • "More comfortable with less of an audience." • "Less stress than SIM session as a group." • "Gives hospital exposure in a safe environment." • "Feels safe and supported, less worried about making mistakes as real simulation can sometimes feel intimidating"
• "Immediate response and no issues or delays with mannequins, telephone calls or imaging requests" • "Potentially cheaper and less resource heavy than traditional in-person simulation" • "Relatively resource cheap compared to SIM rooms and staffing required to run sessions." • "Can implement a huge number of scenarios and promote independent learning." • "Great for visual learners. Great for repetition of acute scenarios and building confidence. Able to practice more in own time."

More skills targeted
• "enforces more reflective practice" • "Useful to feel like you are in the space and communicating, and also able to practice with a mannikin"

Unsupervised practice

The limitations of this study include a small and slightly unequal sample size, collected using convenience sampling, meaning the results are not necessarily generalisable. The method of randomly allocating participants on a first-come, first-served basis was used to overcome the logistical difficulties encountered during recruitment.
There were few convenient times when doctors were available to convene on a specific date outside scheduled working hours that also coincided with availability of the Simulation suite, hence the smaller number of Simulation participants. This method of "randomisation" therefore allowed data collection for the VR cohort to begin before volunteers became unavailable, preventing delays in data collection whilst recruitment for the Simulation cohort was ongoing. Inadvertently, this highlights the convenience and portability of VR as a teaching modality. There is also inherent self-selection bias within this volunteer sample towards learners and acute surgical speciality trainees.
Furthermore, simulation is usually group-based, requiring trainees to be physically present at the same time. The time-consuming efforts required to travel to different training centres make in-person simulation less accessible to those located in rural areas.
This was a multicentre, randomised controlled pilot study conducted across two London hospital sites between March 2023 and May 2023. Recruitment emails and posters were distributed amongst Foundation Year 1 and 2 (FY1/FY2) doctors and Year 1 Core Surgical Trainees (CST). Previous ATLS course attendance was an exclusion criterion, to ensure similar educational maturity of participants.

FIGURE 1 Consort diagram.

The time-limited 15-min VR acute surgical scenario featured a pneumothorax case, where an acutely tachypnoeic patient presented on a background of chronic obstructive pulmonary disease. Management of pneumothorax is part of the ATLS curriculum and the Intercollegiate Surgical Curriculum Programme for core surgical training. A virtual nurse was present as part of the scenario to assist the participant. The OMS case profile and learning objectives are outlined in Supporting Information S1: Appendix S2. Participants were tasked to assess and manage the patient, and possible actions were presented in a drop-down list (Supporting Information S1: Appendix S3). Following the scenario, OMS provided feedback on areas of good performance, areas for improvement, and an overall score.

An in-person 15-min mannequin-based simulation ('Simulation') OSCE was adapted using the same pneumothorax scenario, where participants were tasked to independently assess and manage the patient. An in-depth Simulation tutorial was not necessary as participants already had simulation exposure from their undergraduate and Foundation Year training. Participants were instead introduced to the simulation room and shown where relevant instruments were located, as in the VR tutorial. A Simulation marking scheme was created following OMS and OSCE marking frameworks (Supporting Information S1: Appendix S4). Two human assessors, who hold ATLS qualifications and have interests in medical education, assessed the Simulation participants independently of each other to reduce grading bias.

All participants completed a pre-teaching questionnaire to assess their confidence in managing acute surgical scenarios and any previous simulation or VR exposure. Post-teaching questionnaires were also completed. Performance metrics were categorised into quantitative and qualitative measures.

Quantitative measures:
1. Overall OSCE simulation scores
2. Skills domains scores: Communication, Nontechnical, Teamwork and Technical aspects, as predetermined by OMS (Supporting Information S1: Appendix S5)
3. Scenario completion time and time-to-critical decisions, as predetermined by OMS
4. Likert scale ratings of participants' preparedness and confidence for the scenario
5. Pre-scenario and post-scenario ABS scores

Qualitative measures:
1. Whether participants would like to be taught using their allocated modality again
2. Likes and dislikes about their allocated teaching modality
3. Advantages or disadvantages of their allocated modality

Statistical analysis was performed using Statistical Product and Service Solutions (IBM Corp. 2020. IBM SPSS Statistics for Windows, Version 27.0). The Shapiro-Wilk test was used to assess normality. The independent t-test and Mann-Whitney U test compared overall OSCE and skills domains scores, and time-to-critical decisions. The paired t-test and Wilcoxon signed-rank test compared ABS scores. The Pearson correlation coefficient evaluated any correlation between overall OSCE and ABS scores. The intraclass correlation coefficient was used to calculate inter-rater reliability between the human Simulation assessors. A p-value < 0.05 was considered statistically significant. A power calculation was not considered necessary for this pilot study.
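To make concrete what one of the named group comparisons measures, the following minimal sketch computes the Mann-Whitney U statistic for two independent samples in plain Python. It is illustrative only: the study's analysis was run in SPSS, the scores below are hypothetical rather than study data, the function name `mann_whitney_u` is an assumption, and no p-value is derived here.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (a, b) in which a exceeds b; tied pairs count 0.5.
    """
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two statistics always sum to the number of pairs.
    u_b = len(group_a) * len(group_b) - u_a
    return u_a, u_b


# Hypothetical overall scores (%), not taken from the study.
vr_scores = [72.0, 78.0, 69.0, 81.0]
sim_scores = [55.0, 70.0, 61.0]
u_vr, u_sim = mann_whitney_u(vr_scores, sim_scores)
```

The smaller of the two U values is conventionally compared against a reference distribution to obtain a p-value; a statistics package would normally handle that step, along with the normality check that informs the choice between this test and the independent t-test.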

TABLE 1 Demographic data of participants.
Interested in acute surgical management scenarios?
Undertaken previous BLS/ALS/ALERT/ATLS courses?

Feedback
• "Allows students to get immediate feedback" • "Useful to see clear feedback points regarding what I did well and what I missed, and the timing of each step."

Independent learning
• "Can repeat sessions in own time if desired".

Furthermore, the more subjective critical nature of the Simulation marking, as previously explained, could have contributed to the Simulation participants' lower scores overall, making it an inaccurate reflection of the VR group's relative performance.

Future studies could recruit a larger sample size, with scope for a priori sub-group analyses between novices and advanced clinicians to examine the construct validity of the VR software. It is important to investigate the degree to which VR-based training stimulates meaningful learning that translates into real-world performance benefits. Transfer quality, which reflects effective learning, would ideally involve evaluation of skills and competences in the actual clinical setting after VR training. Indirect predictors of skills transfer include measurement of simulation validity (that is, whether accurate and immersive conditions of real-world scenarios are replicated), engagement of learning strategies and retention of learning outcomes. Research should also establish whether spaced VR teaching sessions translate to improved long-term markers of resilience. A cost-effectiveness analysis between VR and Simulation would be useful for educational providers and could provide insight into use in low-resource settings. Areas to investigate include initial starting costs, scenario development, and maintaining internet access.

| CONCLUSION
In this randomised controlled pilot trial of 18 participants, VR participants scored significantly higher than Simulation participants during an acute surgical scenario, whilst Simulation participants initiated critical decisions and completed the scenario faster. VR allows off-site training and improves short-term markers of confidence. Whilst VR prevails in aspects such as fostering independent learning and allowing immediate feedback, it lacks some of what Simulation provides, including the opportunity to practice communication skills and make clinical decisions following a more natural flow. Overall, both VR and mannequin-based simulation are effective educational modalities which can be used to train junior doctors in acute surgical scenarios but present different educational benefits. Future research should conduct a full comparative randomised controlled trial with a larger sample size and perform a cost-effectiveness analysis.
Both VR and Simulation groups saw an overall increase in post-scenario ABS scores, with the mean VR ABS score increasing from 17.50 (SD ± 2.88) to 19.10 (SD ± 3.18), and the mean Simulation ABS score from 17.25 (SD ± 5.60) to 17.75 (SD ± 4.98), though this increase was only significant for VR (p = 0.01) (Table 5).

TABLE 2a Preparedness and confidence of the early assessment and management of acute surgical scenarios before the session.
Note: Rating on a scale of 1-10, with 1 being least and 10 being most. Reported as mean and standard deviation (SD).

TABLE 3 Comparison of Objective Structured Clinical Examination scores and skills domains.
Note: Significance is calculated between VR and simulation scores. Reported as mean and standard deviation (SD). An asterisk denotes p-value is significant, whereby * means p ≤ 0.05, and ** means p ≤ 0.01.

Note: Reported as mean and standard deviation (SD). An asterisk denotes p-value is significant, whereby ** means p ≤ 0.01 and *** means p ≤ 0.001. No mean or standard deviation (SD) available for "verify name and date of birth" as too few participants performed the action. Critical decisions not performed by some participants were classed as missing data. One virtual reality (VR) participant did not perform a chest drain; three VR participants did not assess the airway. Seven VR and three Simulation participants did not verify the patient's name and date of birth.

TABLE 5 Comparison of pre-ABS and post-ABS scores.
Note: Reported as mean and standard deviation (SD). An asterisk denotes p-value is significant, whereby ** means p ≤ 0.01.
• "allows me to practice sims outside a supervised environment, sims are really rare in my med[ical] school teaching"

This study's strengths lie in contributing to a small pool of literature, as there is a paucity of research directly comparing the use of VR and Simulation to train doctors in acute surgical scenarios. It is also the first to look at ABS changes in VR for surgical teaching. Bias was reduced where possible. The study has demonstrated excellent inter-rater reliability in marking Simulation scenarios. Grading bias

there was inadequate evidence to suggest that VR can facilitate training for open trauma surgery or replace cadavers. This was mainly attributed to poor study designs and low sample sizes, which most studies were subject to, the latter of which applies to this present study. However, it is worth noting that training to manage an ATLS scenario is markedly different from training for open trauma surgery.